the cells in the table have large counts, but it becomes unreliable when one or more cell counts is
very small (or zero). There are different recommendations as to the minimum counts you need per
cell in order to confidently use the chi-square test. A rule of thumb that many analysts use is that
you should have at least five observations in each cell of your table (or better yet, an expected
count of at least five in each cell).
It’s not good at detecting trends. The chi-square test isn’t good at detecting small but steady
progressive trends across the successive categories of an ordinal variable (see Chapter 4 if you’re
not sure what ordinal is). It may give a significant result if the trend is strong enough, but it’s not
designed specifically to work with ordinal categorical data. In those cases, you should use a
Mantel-Haenszel chi-square test for trend, which is outside the scope of this book.
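To make the minimum-count rule of thumb concrete, here is a minimal sketch that computes the expected count for each cell (row total times column total, divided by the grand total) and flags any cell that falls below five. The function names `expected_counts` and `small_cells` and the counts in `table` are made up for illustration; they aren't from this book's sample data.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def small_cells(table, minimum=5):
    """Return (row, column, expected) for every cell whose expected count is below `minimum`."""
    return [
        (i, j, e)
        for i, row in enumerate(expected_counts(table))
        for j, e in enumerate(row)
        if e < minimum
    ]


# Hypothetical counts, chosen only to illustrate the check.
table = [[12, 3],
         [9, 1]]

for i, j, e in small_cells(table):
    print(f"cell ({i}, {j}): expected count {e:.2f} is below 5")
```

If the check flags any cells, that's a hint that the chi-square p value may not be trustworthy for that table.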
Modifying the chi-square test: The Yates continuity correction
There is a little drama around the original Pearson chi-square of association test that needs to be
mentioned here. Yates, who was a contemporary of Pearson, developed what is called the Yates
continuity correction. Yates argued that in the special case of the fourfold table, adding this correction
results in more reliable p values. The correction consists of subtracting 0.5 from the magnitude of the
(observed – expected) difference in each cell before squaring it.
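If it helps to see the correction as a calculation, the following sketch computes the chi-square statistic for a fourfold table with and without it. The function name `chi_square_2x2` and the counts are hypothetical, and clamping the corrected difference at zero is a common safeguard that is assumed here rather than taken from the text.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a fourfold (2x2) table, optionally with the Yates correction."""
    (a, b), (c, d) = table
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Subtract 0.5 from the magnitude of the difference before squaring.
                # Clamping at zero is an assumed safeguard for very small differences.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    return statistic


# Hypothetical counts, chosen only to illustrate the calculation.
table = [[20, 10],
         [10, 20]]

print(f"Pearson chi-square: {chi_square_2x2(table):.2f}")
print(f"Yates chi-square:   {chi_square_2x2(table, yates=True):.2f}")
```

The corrected statistic is always the smaller of the two, which is why the correction makes the test more conservative.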
Let’s apply the Yates continuity correction for your analysis of the sample data in the earlier section
“Understanding how the chi-square test works.” Take a look at Figure 12-3, which has the differences
between the values in the observed and expected cells. The application of the Yates correction changes
the 7.20 (or –7.20) difference in each cell to 6.70 (or –6.70). This lowers the chi-square value from
8.81 down to 7.63 and increases the p value from 0.0030 to 0.0057, which is still very significant:
the chance of random fluctuations producing such an apparent effect in your sample is only about 1 in
175 (because 1/0.0057 ≈ 175).
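You can check this arithmetic using only the numbers quoted above. In a fourfold table every cell has the same magnitude of difference, so shrinking each difference from 7.20 to 6.70 rescales the whole chi-square value by (6.70/7.20) squared. The sketch below does that rescaling and then converts each chi-square value to a p value for 1 degree of freedom; the helper name `p_value_1df` is made up for illustration, and it relies on the standard fact that a 1-df chi-square is a squared standard normal.

```python
import math


def p_value_1df(chi_square):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    # With 1 df, chi-square is a squared standard normal, so
    # P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2))


pearson = 8.81  # uncorrected chi-square value quoted in the text
diff = 7.20     # magnitude of (observed - expected) in every cell

# Every cell shares the same |difference|, so the correction rescales the statistic.
yates = pearson * ((diff - 0.5) / diff) ** 2

p_pearson = p_value_1df(pearson)
p_yates = p_value_1df(yates)

print(f"Yates chi-square: {yates:.2f}")
print(f"p (Pearson):      {p_pearson:.4f}")
print(f"p (Yates):        {p_yates:.4f}")
print(f"about 1 in {1 / round(p_yates, 4):.0f}")
```

Run as written, this should reproduce the values in the text to the precision shown there.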
The Yates correction to the Pearson chi-square test applies only to the fourfold table (and
not to tables with more rows or columns), and some statisticians feel the correction is too
strict. Nevertheless, it is built into statistical software like R, so if you run a Pearson
chi-square test using most statistical software, it automatically applies the Yates correction
when analyzing a fourfold table (see Chapter 4 for a discussion of statistical software).
Focusing on the Fisher Exact Test
The Pearson chi-square test described earlier isn’t the only way to analyze cross-tabulated data.
Remember that one of the cons was that it is not an exact test? Famous but controversial statistician R.
A. Fisher invented another test in the 1920s that gives the exact p value and can handle tables with very
small cell counts (even cell counts of zero!). Not surprisingly, this test is called the Fisher Exact test
(also sometimes referred to as Fisher’s exact test, or just Fisher).
Understanding how the Fisher Exact test works
As with the chi-square test, you don’t have to know the details of the Fisher Exact test to use it. If you